Automated Discovery of WordNet Relations
نویسنده
چکیده
The WordNet lexical database is now quite large and offers broad coverage of general lexical relations in English. As is evident in this volume, WordNet has been employed as a resource for many applications in natural language processing (NLP) and information retrieval (IR). However, many potentially useful lexical relations are currently missing from WordNet. Some of these relations, while useful for NLP and IR applications, are not necessarily appropriate for a general, domain-independent lexical database. For example, WordNet’s coverage of proper nouns is rather sparse, but proper nouns are often very important in application tasks. The standard way lexicographers find new relations is to look through huge lists of concordance lines. However, culling through long lists of concordance lines can be a rather daunting task (Church and Hanks, 1990), so a method that picks out those lines that are very likely to hold relations of interest should be an improvement over more traditional techniques. This chapter describes a method for the automatic discovery of WordNetstyle lexico-semantic relations by searching for corresponding lexico-syntactic patterns in large text collections. Large text corpora are now widely available, and can be viewed as vast resources from which to mine lexical, syntactic, and semantic information. This idea is reminiscent of what is known as “data mining” in the artificial intelligence literature (Fayyad and Uthurusamy, 1996), however, in this case the ore is raw text rather than tables of numerical data. The Lexico-Syntactic Pattern Extraction (LSPE) method is meant to be useful as an automated or semi-automated aid for lexicographers and builders of domain-dependent knowledge-bases. The LSPE technique is light-weight; it does not require a knowledge base or complex interpretation modules in order to suggest new WordNet relations.
منابع مشابه
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
متن کاملAutomated Discovery of Telic Relations for WordNet
A method is presented for automatically extending WordNet with the telic relationships proposed in Pustejovsky’s lexicon model. The method extracts telic relationships from WordNet glosses by first selecting a telic word through a pattern matcher aided by a part-of-speech tagger and then employing a word disambiguation module to select the specific meaning (synset) of the telic word. The method...
متن کاملLeveraging Morpho-semantics for the Discovery of Relations in Chinese Wordnet
Semantic relations of different types have played an important role in wordnet, and have been widely recognized in various fields. In recent years, with the growing interests of constructing semantic network in support of intelligent systems, automatic semantic relation discovery has become an urgent task. This paper aims to extract semantic relations relying on the in situ morpho-semantic stru...
متن کاملLearning Lexical Semantic Relations using Lexical Analogies — Extended Abstract
Linguistic ontologies, most notably WordNet [1], have been shown to be a valuable resource for a variety of natural language processing applications. Presently, linguistic ontologies are largely constructed by hand, which is both difficult and expensive. A central problem that demands an automated solution is the discovery and incorporation of lexical semantic relations, or semantic relations b...
متن کاملSurvey on Perception of People Regarding Utilization of Computer Science & Information Technology in Manipulation of Big Data, Disease Detection & Drug Discovery
this research explores the manipulation of biomedical big data and diseases detection using automated computing mechanisms. As efficient and cost effective way to discover disease and drug is important for a society so computer aided automated system is a must. This paper aims to understand the importance of computer aided automated system among the people. The analysis result from collected da...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004